Estimating Criterion-referenced Standards for Multiple-choice Examinations

Authors

  • Francis P. Hughes
  • Charles F. Schumacher
  • Benjamin D. Wright
Abstract

Four methods were studied for setting a standard on a written examination containing several clinical discipline subtests. The NBME method yielded the most consistent estimate. The Angoff and Ebel methods yielded slightly less consistent estimates. All estimates were more consistent when computed as the average of discipline standards rather than the judges' personal standards. All but the Essential Content method yielded similar and practical estimates of the standard. The Ebel method was not found to be feasible for use with the examination studied; however, incorporating clusters of equally difficult, relevant items and feedback based on the judges' previous judgments into the NBME method yielded subsequent estimates of the standard that were more consistent than, but not significantly different from, the first estimate. The findings suggest the importance of enabling judges to select the items which will be most meaningful to them in expressing their judgments, the benefit of presenting clusters of equally difficult and relevant items and the utility of providing the judges feedback about their judgments and the opportunity to reconsider them. The study also showed that Rasch calibration and equating procedures provide a feasible methodology for expressing judgments based on different item samples on a common measurement scale.

Similar resources

Setting and maintaining standards in multiple choice examinations: AMEE Guide No. 37.

The process of setting a standard when pass/fail decisions have to be made inevitably involves judgment about the point on the test score scale where performance is deemed to be adequate for the purpose for which the examination is set. As with any process which involves human judgment, setting this standard is likely to include a certain degree of error, which may result in some false positive...

Review of criterion-referenced standards for cardiorespiratory fitness: what percentage of 1 142 026 international children and youth are apparently healthy?

PURPOSE To identify criterion-referenced standards for cardiorespiratory fitness (CRF); to estimate the percentage of children and youth that met each standard; and to discuss strategies to help improve the utility of criterion-referenced standards for population health research. METHODS A search of four databases was undertaken to identify papers that reported criterion-referenced CRF standa...

Guessing in Multiple Choice Questions: Challenges and Strategies

Introduction: Guessing is one of the most challenging issues in multiple choice questions. Several strategies, such as negative scoring, have been suggested for preventing students from choosing the right answer just by chance. However, there is no general agreement on using such strategies. The aim of this study was to review the scoring methods which are used for reducing guessing, and evalua...

The Impact of Correction for Guessing Formula on MC and Yes/No Vocabulary Tests' Scores

A standard correction for random guessing (cfg) formula on multiple-choice and Yes/No examinations was examined retrospectively in the scores of the intermediate female EFL learners in an English language school. The correction was a weighting formula for points awarded for correct answers, incorrect answers, and unanswered questions so that the expected value of the increase in test score due to g...

Educational Testing: Measuring and Remedying Achievement Gaps

Achievement gaps, as measured by standardized tests, are inextricably related to educational goals, standards, norms, and benchmarks for student learning outcomes. I revisit conventional approaches to educational testing to measure achievement gaps—norm-referenced, criterion-referenced, and potential-referenced tests. I explore and discuss a paradigm shift from “passive” tests to “responsive” t...

Publication date: 2010